9 research outputs found

    Large scale data analysis using MLlib

    Recent advances in the internet, social media, and Internet of Things (IoT) devices have significantly increased the amount of data generated in a variety of formats. The data must be converted into formats that are easily handled by data analysis techniques. Applying machine learning algorithms to large and complex data sets is computationally expensive and demands substantial logical and physical resources. Machine learning is a sophisticated data analytics technology that has gained importance because of the massive amount of data generated daily that needs to be examined. The Apache Spark machine learning library (MLlib) is one of the big data analysis platforms and provides a variety of functions for machine learning tasks, ranging from classification to regression and dimension reduction. From a computational standpoint, this research investigated Apache Spark MLlib 2.0 as an open source, autonomous, scalable, and distributed learning library. Several real-world machine learning experiments are carried out to evaluate the properties of the platform qualitatively and quantitatively. Some of the fundamental concepts and approaches for developing a scalable data model in a distributed environment are also discussed.

    The classification of the modern arabic poetry using machine learning

    In recent years, text classification and analysis of Arabic texts using machine learning has seen some progress, but most of this research has not focused on Arabic poetry. Because of difficulties in analyzing Arabic poetry, the approach requires the use of standard Arabic, on which “Al Arud”, the science of studying poetry, is based. This paper presents an approach that uses machine learning to classify modern Arabic poetry into four types: love poems, Islamic poems, social poems, and political poems. Each of these types usually has features that indicate the class of the poem. Despite the challenges posed by the difficulty of the rules of the Arabic language on which this classification depends, we propose a new automatic method for classifying modern Arabic poems. The proposed method is suitable for the above-mentioned classes of poems. This study used Naïve Bayes, Support Vector Machines, and Linear Support Vector for classification. Data preprocessing was an important step of the approach, as it increased the classification accuracy.
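
    As a rough illustration of the Naïve Bayes step named in this abstract, the sketch below implements a multinomial Naïve Bayes classifier with add-one smoothing in plain Python. It is not the authors' implementation: the `MultinomialNB` class, the English stand-in tokens, and the four toy training documents are all assumptions made up for this example, and real Arabic text would first need the preprocessing the abstract mentions.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Minimal multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            n_words = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                if token in self.vocab:  # tokens never seen in training are skipped
                    count = self.word_counts[label][token]
                    score += math.log((count + 1) / (n_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus: one tokenized "poem" per class from the abstract.
train_docs = [
    "heart beloved longing".split(),
    "prayer faith mercy".split(),
    "poverty school village".split(),
    "homeland freedom revolution".split(),
]
train_labels = ["love", "islamic", "social", "political"]

model = MultinomialNB().fit(train_docs, train_labels)
predicted = model.predict("freedom homeland".split())
```

    With this toy corpus, `predicted` is the class whose training tokens overlap the query. A realistic experiment would need a much larger labeled corpus and a held-out test set.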

    Security and accountability for sharing the data stored in the cloud

    Cloud computing is an infrastructure that provides on-demand network services shared among multiple clients, which makes resource allocation important for the cloud service provider. The most important feature of cloud services is that users' data are hosted remotely. While users benefit from this emerging technology, their fear of losing control of their own data is becoming a notable hurdle to the wide adoption of cloud services. The cloud service provider module processes data owners' requests to store data files and applications, and provides cloud users' log details to the data owner for audit purposes. To address this problem, a framework based on information accountability is proposed to keep track of the actual handling of users' data in the cloud. With the proposed system, the owner can fully track the data and monitor compliance with service agreements through access control, usage control, and management.

    A new model for large dataset dimensionality reduction based on teaching learning-based optimization and logistic regression

    Breast cancer (BC) is one of the human diseases with a high annual mortality rate. Among all forms of cancer, BC is the most common cause of death among women globally. Data mining and classification methods are effective ways of classifying data. These methods are particularly useful in the medical field because medical datasets contain irrelevant and redundant attributes, which are not needed to obtain an accurate disease diagnosis. Teaching learning-based optimization (TLBO) is a new metaheuristic that has been successfully applied to several intractable optimization problems in recent years. This paper presents the use of a multi-objective TLBO algorithm for selecting feature subsets in automatic BC diagnosis. Logistic regression (LR) was used for the classification task. The results show that the proposed method produced better classification accuracy on the BC dataset (classified into malignant and benign). This indicates that the proposed TLBO is an efficient feature optimization technique for supporting data-based decision-making systems.
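
    For readers unfamiliar with TLBO, the sketch below shows the basic continuous, single-objective form of the algorithm with its teacher and learner phases. It is only an illustration under stated assumptions: the paper uses a multi-objective variant for feature subset selection with logistic regression, which is not reproduced here, and the `tlbo` function, the `sphere` toy objective, and all parameter values are hypothetical choices for this example.

```python
import random


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def tlbo(objective, dim, bounds, pop_size=20, iterations=100, seed=0):
    """Minimize `objective` over a box using basic single-objective TLBO."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return max(lo, min(hi, v))

    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [objective(p) for p in pop]

    for _ in range(iterations):
        # Teacher phase: move each learner toward the best solution,
        # relative to the class mean scaled by a teaching factor of 1 or 2.
        teacher = list(pop[min(range(pop_size), key=lambda k: fit[k])])
        mean = [sum(p[d] for p in pop) / pop_size for d in range(dim)]
        for i in range(pop_size):
            tf = rng.randint(1, 2)
            cand = [clip(pop[i][d] + rng.random() * (teacher[d] - tf * mean[d]))
                    for d in range(dim)]
            f = objective(cand)
            if f < fit[i]:  # greedy acceptance: keep only improvements
                pop[i], fit[i] = cand, f

        # Learner phase: each learner interacts with a random peer and
        # moves away from a worse peer or toward a better one.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            sign = 1.0 if fit[i] < fit[j] else -1.0
            cand = [clip(pop[i][d] + sign * rng.random() * (pop[i][d] - pop[j][d]))
                    for d in range(dim)]
            f = objective(cand)
            if f < fit[i]:
                pop[i], fit[i] = cand, f

    best = min(range(pop_size), key=lambda k: fit[k])
    return pop[best], fit[best]


best_x, best_value = tlbo(sphere, dim=3, bounds=(-5.0, 5.0))
```

    Because candidates are accepted only when they improve a learner's fitness, the best value never gets worse across iterations. Adapting this to feature selection would require a binary encoding of feature subsets and an objective built from classifier accuracy, neither of which is shown here.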

    A comprehensive study: Ant Colony Optimization (ACO) for Facility Layout Problem

    In the context of manufacturing, numerous models have been designed to represent the facility layout problem (FLP), and a variety of optimization methods have been applied to solve these models. The ultimate goal of these methods is to find optimal solutions. Within Swarm Intelligence (SI), Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) are regarded as the most important techniques of our time. This paper provides a brief introduction to the most promising approaches so far to facility layout related topics. The succeeding paper will then illustrate some of those in more detail. Moreover, we examine ACO modifications and extensions that could contribute to optimization methods in FLP, which mostly correspond to NP-hard combinatorial problems. Future research areas are identified in Construction Site Facility Layout Problems, Multi-Criteria Facility Layout Problems, and Dynamic Facility Layout Problems.

    Prompt Engineering: Guiding the Way to Effective Large Language Models

    Large language models (LLMs) have become prominent tools in various domains, such as natural language processing, machine translation, and the generation of creative text. Nevertheless, to fully exploit the capabilities of language models, it is imperative to establish efficient communication channels between humans and machines. The discipline of prompt engineering involves the creation of well-constructed and informative prompts, which act as a crucial link between human intention and the execution of tasks by machines. The present study examines the concept of prompt engineering, elucidating its underlying concepts, methodologies, and diverse range of practical applications.

    Distributed denial of service attack defense system-based auto machine learning algorithm

    The use of network-connected devices is rising quickly in the internet age, which is increasing the number of cyberattacks. Detecting distributed denial of service (DDoS) attacks is a tedious task, and a number of models have recently been developed for it. Nonetheless, because of major fluctuations in subscriptions and traffic rates, it remains a difficult challenge. To address this issue, this work presents a novel automatic detection technique that reduces the feature space and consequently reduces computational time and model overfitting. Data preprocessing is done first to increase the model's generalizability; then, a feature selection method is used to choose the most pertinent features and increase the accuracy of the classification process. Additionally, hyperparameter tuning (choosing the proper parameters for the learning approach) improved model performance. Finally, the support vector machine (SVM) is compatible with the optimization and the hyperparameters offered by supervised learning methods. The CICDDoS2019 dataset was used to evaluate each of these steps, and the experimental findings demonstrated that, with an accuracy of 99.95%, the suggested model performs well compared to more recent techniques.